    Evaluation of String Distance Algorithms for Dialectology

    Afrikaans and Dutch as closely-related languages: A comparison to West Germanic languages and Dutch dialects

    Following Den Besten's (2009) desiderata for historical linguistics of Afrikaans, this article aims to contribute some modern evidence to the debate regarding the founding dialects of Afrikaans. From an applied perspective (i.e. human language technology), we aim to determine which West Germanic language(s) and/or dialect(s) would be best suited for the purposes of recycling speech resources for the benefit of developing speech technologies for Afrikaans. Being recognised as a West Germanic language, Afrikaans is first compared to Standard Dutch, Standard Frisian and Standard German. Pronunciation distances are measured by means of Levenshtein distances. Afrikaans is found to be closest to Standard Dutch. Secondly, Afrikaans is compared to 361 Dutch dialectal varieties in the Netherlands and North-Belgium, using material from the Reeks Nederlandse Dialectatlassen, a series of dialect atlases compiled by Blancquaert and Pée in the period 1925-1982 which cover the Dutch dialect area. Afrikaans is found to be closest to the South-Holland dialectal variety of Zoetermeer; this largely agrees with the findings of Kloeke (1950). No speech resources are available for Zoetermeer, but such resources are available for Standard Dutch. Although the dialect of Zoetermeer is significantly closer to Afrikaans than Standard Dutch is, Standard Dutch speech resources might be a good substitute.
    Keywords: human language technologies, speech resources, Afrikaans, Dutch, acoustic distanc
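
    The abstract above measures pronunciation distance with Levenshtein distances. As a rough illustration only, the sketch below computes a plain edit distance between two transcriptions and averages it over the words two varieties share. The function names, the normalisation by the longer transcription, and the unit cost for every operation are assumptions made for this example; the article may weight segments or normalise differently.

```python
# Minimal sketch of a Levenshtein-based pronunciation distance.
# Assumptions (not taken from the article): every insertion, deletion and
# substitution costs 1, and word distances are divided by the length of
# the longer transcription before averaging.

def levenshtein(a, b):
    """Return the minimum number of single-segment edits turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, seg_a in enumerate(a, start=1):
        curr = [i]  # deleting the first i segments of a
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance scaled to the range 0..1 by the longer transcription."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(variety_a, variety_b):
    """Average normalised distance over the words both varieties contain.

    Each argument is a hypothetical dict mapping a word label to its
    transcription (a string or a list of segments).
    """
    shared = sorted(variety_a.keys() & variety_b.keys())
    if not shared:
        raise ValueError("the two varieties share no words")
    total = sum(normalized_distance(variety_a[w], variety_b[w]) for w in shared)
    return total / len(shared)
```

    Finding the closest variety then amounts to computing `mean_distance` between the target word list and each candidate word list and taking the minimum. Transcriptions can be passed as lists of phonetic segments rather than strings, so that multi-character symbols count as one unit.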

    The relationship between first language acquisition and dialect variation: Linking resources from distinct disciplines in a CLARIN-NL project

    It is remarkable that first language acquisition and historical dialectology should have remained strange bedfellows for so long, considering the common assumption in historical linguistics that language change is due to the process of non-target transmission of linguistic features, forms and structures between generations, and thus between parents or adults and children. Both disciplines have remained isolated from each other due to, among other things, different research questions, methods of data collection and types of empirical resources. The aim of this paper is to demonstrate that the common assumption in historical linguistics mentioned above can be examined with the help of Digital Humanities projects like CLARIN. The CLARIN infrastructure makes it possible to carry out e-Humanities research by combining datasets from distinct disciplines through tools for data processing. The outcome of the CLARIN-NL COAVA project (an acronym of Cognition, Acquisition and Variation tool) allows researchers to access two datasets from two different subdisciplines simultaneously, namely Dutch first language acquisition files located in CHILDES (MacWhinney, 2000) and historical Dutch Dialect Dictionaries, through the development of a tool for easy exploration of nouns

    Estimating the Level and Direction of Phonetic Dialect Change in the Northern Netherlands

    This article reports ongoing investigations into phonetic change of dialect groups in the northern Netherlandic language area, particularly the Frisian and Low Saxon dialect groups, which are known to differ in vitality. To achieve this, we combine existing phonetically transcribed corpora with dialectometric approaches that allow us to quantify change among older male dialect speakers in a real-time framework. A multidimensional variant of the Levenshtein distance, combined with methods that induce realistic phonetic distances between transcriptions, is used to estimate how much dialect groups have changed between 1990 and 2010, and whether they changed towards Standard Dutch or away from it. Our analyses indicate that language change is a slow process in this geographical area. Moreover, the Frisian and Groningen dialect groups seem to be most stable, while the other Low Saxon varieties (excluding the Groningen dialect group) were shown to be most prone to change. We offer possible explanations for our findings, while we discuss shortcomings of the data and approach in detail, as well as desiderata for future research